expression simplification
See Table 1 for the average running time per problem instance. Note that Z3 and OR-tools are implemented in C++, while NeuRewriter and the RL baselines are implemented in Python. Still, we observe that our approach achieves a better balance between time efficiency and result quality. For expression simplification and job scheduling, NeuRewriter is even more time-efficient than Z3 and OR-tools. The region-picker π_ω is parameterized by a Q-function and is similar in spirit to soft-Q learning [2].
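To make the soft-Q analogy concrete, the sketch below samples a region with probability proportional to exp(Q / temperature), which is the Boltzmann form that soft-Q learning uses for its policy. It is only an illustration under stated assumptions: the function names, the `temperature` parameter, and the plain list of Q-values are hypothetical, and the text above does not specify the exact distribution the region-picker uses.

```python
import math
import random


def region_probabilities(q_values, temperature=1.0):
    """Boltzmann (softmax) distribution over candidate regions.

    Assumption: q_values[i] is the Q-value of rewriting region i.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def pick_region(q_values, temperature=1.0, rng=random):
    """Sample the index of one region according to the softmax of its Q-value."""
    probs = region_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A lower temperature concentrates the probability mass on the highest-valued region, while a higher temperature makes the choice closer to uniform.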
Learning to Progressively Plan
For problem solving, making reactive decisions based on the problem description is fast but inaccurate, while search-based planning using heuristics gives better solutions but can be exponentially slow. In this paper, we propose a new approach that improves an existing solution by iteratively picking and rewriting its local components until convergence. The rewriting policy employs a neural network trained with reinforcement learning. We evaluate our approach in two domains: job scheduling and expression simplification. Compared to common effective heuristics, baseline deep models, and search algorithms, our approach efficiently gives solutions with higher quality.
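The pick-and-rewrite loop can be sketched on a toy version of expression simplification. This is a minimal sketch under loud assumptions: a greedy "smallest result" choice stands in for the learned neural policy, and the tuple representation, the three algebraic rules, and all function names are hypothetical, chosen only to show the control flow of improving a solution through local rewrites until none applies.

```python
# Expressions are leaves (variable name or integer constant) or
# nested tuples of the form (op, left, right).

def size(expr):
    """Number of nodes in the expression tree; used here as the cost."""
    if isinstance(expr, tuple):
        return 1 + size(expr[1]) + size(expr[2])
    return 1


def rewrite_node(expr):
    """Apply one illustrative rule at this node, or return None if none applies."""
    if not isinstance(expr, tuple):
        return None
    op, a, b = expr
    if op == "+" and b == 0:
        return a
    if op == "+" and a == 0:
        return b
    if op == "*" and b == 1:
        return a
    if op == "*" and a == 1:
        return b
    if op == "*" and (a == 0 or b == 0):
        return 0
    return None


def regions(expr, path=()):
    """Yield the path to every subtree; each subtree is a candidate region."""
    yield path
    if isinstance(expr, tuple):
        yield from regions(expr[1], path + (1,))
        yield from regions(expr[2], path + (2,))


def get(expr, path):
    """Return the subtree reached by following path from the root."""
    for index in path:
        expr = expr[index]
    return expr


def replace(expr, path, new):
    """Return a copy of expr with the subtree at path replaced by new."""
    if not path:
        return new
    parts = list(expr)
    parts[path[0]] = replace(expr[path[0]], path[1:], new)
    return tuple(parts)


def simplify(expr, max_steps=100):
    """Repeatedly pick a region and rewrite it until no rule applies."""
    for _ in range(max_steps):
        best = None
        for path in regions(expr):
            new_sub = rewrite_node(get(expr, path))
            if new_sub is None:
                continue
            candidate = replace(expr, path, new_sub)
            # Greedy stand-in for a learned policy: keep the smallest result.
            if best is None or size(candidate) < size(best):
                best = candidate
        if best is None:
            return expr  # converged: no local rewrite applies
        expr = best
    return expr
```

For example, `simplify(("+", ("*", "x", 1), ("*", "y", 0)))` rewrites one local component per step and stops once no rule matches. Every rule in this sketch strictly shrinks the tree, so the loop terminates without relying on `max_steps`.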